1. INTRODUCTION

As a kind of high-mobility, low-cost aircraft, the unmanned aerial vehicle (UAV) has been widely used in civil
and military fields. In the military field, UAVs have been used in suppression of enemy air defense (SEAD) missions.
However, as environments and situations become more complex, a single UAV can hardly
achieve the desired result. Thus, heterogeneous UAVs with different operational capabilities are needed, and
cooperation among multiple UAVs is essential for completing the mission effectively. In essence, the task
allocation problem is a complex, NP-hard combinatorial optimization problem [1], in which the computational cost
grows exponentially with the number of variables. In general, there are two classes of methods for solving
complex combinatorial optimization problems: traditional methods and intelligent optimization methods.

Among the traditional methods, dynamic programming and mixed-integer linear methods have been used to
solve this problem. Traditional methods can obtain optimal solutions for small-scale problems.
For large-scale problems, however, it is difficult to obtain the optimal solution in a reasonable time. The intelligent
optimization methods mainly refer to a family of more recent algorithms called metaheuristic algorithms,
which are inspired by concepts from nature such as animal behaviors and physical phenomena. Unlike
the traditional methods, a metaheuristic algorithm cannot always obtain the exact optimal solution because of its
randomness. Instead, it produces an approximately optimal solution in a reasonable
time [2]. Several metaheuristic algorithms have already been used to solve task allocation problems in
SEAD, such as particle swarm optimization (PSO) [3], the genetic algorithm (GA) [4], and ant colony optimization
(ACO) [5]. Because they do not require gradient information, these algorithms are more efficient
than traditional methods.


Journal homepage: http://ijeecs.iaescore.com 


ISSN: 2502-4752


According to the no free lunch (NFL) theorem, different metaheuristic algorithms are suited to different
optimization problems [6]. Many metaheuristic algorithms with novel search mechanisms have therefore been
proposed. They all seek a balance between exploration and exploitation in the search process [7]. The
exploration process is more inclined to search the whole search space, aiming to discover the region
where the optimal solution may exist [8]. The grey wolf optimizer (GWO) is a metaheuristic
optimizer proposed by Seyedali Mirjalili in 2014 [9]. Because of its simple implementation, flexible
use, and fast convergence, GWO has been widely used in various optimization problems such as feature selection
[10], structural damage identification [11], electric load forecasting [12], and path planning [13]. Like
most metaheuristic algorithms, GWO also seeks a balance between exploration and exploitation. However,
during the update process, all search agents move toward the current best solutions, which allows GWO to
converge rapidly but leads to poor population diversity and makes it prone to falling into local optima when dealing
with large-scale optimization problems [14], [15]. To remedy this defect, this article proposes an improved
GWO (IGWO) with a congestion control strategy based on population control and a global best search strategy
based on random search. The IGWO is then applied to the multiple-UAV task
allocation problem. The experimental results indicate that the proposed IGWO has an advantage on the
large-scale multiple-UAV task allocation problem.


2. MODEL 
2.1. Multiple UAVs task allocation model 
2.1.1. Basic model definition 

Tables 1 and 2 list the parameter settings of the problem. The parameters are defined based on the
works in [16]-[18]. There are $N_u$ UAVs $U_j$, $j \in \{1, 2, \dots, N_u\}$, in the heterogeneous UAV system and $N_t$ targets
$T_i$, $i \in \{1, 2, \dots, N_t\}$, each with a two-dimensional position $L_i = (x_i, y_i)$. Each target contains three types of tasks
($k = 1, 2, 3$ for reconnaissance, attack, and verification) that must be performed sequentially. $t_k$ denotes the
time needed to perform a task of type $k$. The UAVs differ mainly in the equipment they carry
for performing the different tasks, which is expressed by $A^{u}_{j,k}$, $k = 1, 2, 3$, the ability of UAV $j$ to perform
reconnaissance, attack, and verification tasks. Correspondingly, the ability that target $i$ demands for
each task is denoted $A^{t}_{i,k}$, $k = 1, 2, 3$. For simplicity, ability values lie in $[0, 1]$, and a task can be performed
only if the ability of the UAV is not smaller than the ability demanded by the target.
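As an illustration, the feasibility rule for abilities can be sketched in Python. The function name `can_perform` and the ability values below are hypothetical and are not taken from the paper; the sketch also assumes a non-strict comparison, so that equal ability and demand count as feasible.

```python
from typing import Sequence


def can_perform(uav_ability: Sequence[float],
                target_demand: Sequence[float],
                k: int) -> bool:
    """Return True if a UAV can perform task type k on a target.

    k = 1, 2, 3 stands for reconnaissance, attack, and verification.
    Both sequences hold one ability value in [0, 1] per task type.
    """
    if k not in (1, 2, 3):
        raise ValueError("k must be 1, 2 or 3")
    # Assumption: equality between ability and demand is treated as feasible.
    return uav_ability[k - 1] >= target_demand[k - 1]


# Hypothetical example values (not from the paper).
uav = [0.9, 0.4, 0.7]
target = [0.5, 0.6, 0.7]
feasible_tasks = [k for k in (1, 2, 3) if can_perform(uav, target, k)]
```

In this example the UAV is strong enough for reconnaissance and verification but not for the attack task, so a second UAV would be needed for that target.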


Table 1. Attributes of targets and tasks

Model              Attribute             Parameter
Target, $T_i$      Number of targets     $N_t$
                   Target location       $L_i = (x_i, y_i)$
                   Demanding ability     $A^{t}_{i,k}$, $k \in K$
Task, $M_{k,i}$    Number of tasks       $N_m$
                   Task type             $K = \{1, 2, 3\}$
                   Performing time       $t_k$, $k \in K$


Table 2. Attributes of UAVs

Model          Attribute            Parameter
UAV, $U_j$     Number of UAVs       $N_u$
               Velocity             $V_j$
               Executive ability    $A^{u}_{j,k}$, $k \in K$


2.1.2. Mathematical models 
Based on the above considerations, the mathematical model is shown as follows: 


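$$\min \left( \max_{i \in T,\, k \in K} \left( t_{i,k} \right) + \sum_{u \in U,\, i \in T} \left( T_{O_{u},i} + T_{PT_{u},i} \right) \right) \tag{1}$$

$$\text{s.t.} \quad \sum_{i \in T} x^{u}_{i,k} \le 1, \quad u \in U, \; k = 2 \tag{2}$$

$$\sum_{u \in U} x^{u}_{i,k} = 1, \quad i \in T, \; k = 1, 2, 3 \tag{3}$$

$$A^{u}_{u_{i,k},k} \cdot x^{u_{i,k}}_{i,k} \ge A^{t}_{i,k}, \quad k = 1, 2, 3 \tag{4}$$

$$t_{i,k} = \begin{cases} \max \left( T_{O_{u_{i,k}},i},\; t_{i,k-1} \right) + t_k, & PT_{u_{i,k}} = 0 \\ \max \left( t_{PT_{u_{i,k}},PTS_{u_{i,k}}} + T_{PT_{u_{i,k}},i},\; t_{i,k-1} \right) + t_k, & PT_{u_{i,k}} \neq 0 \end{cases} \tag{5}$$


Indonesian J Elec Eng & Comp Sci, Vol. 30, No. 1, April 2023: 577-585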


In (1), $i$ is the index of the target and $k$ is the task type; $k = 1, 2, 3$ denote the
reconnaissance, attack, and verification tasks, respectively. $t_{i,k}$ is the completion time of task $k$ on target $i$. $u_{i,k}$ is the
index of the UAV that performs task $k$ on target $i$. $PT_{u_{i,k}}$ is the index of the previous target
of UAV $u_{i,k}$. $T_{O_{u_{i,k}},i}$ is the flight time of UAV $u_{i,k}$ from the airport to target $i$. $T_{PT_{u_{i,k}},i}$ is the
flight time of UAV $u_{i,k}$ from its previous target to target $i$. In (2) and (3), $x^{u}_{i,k}$ is a binary decision variable:
$x^{u}_{i,k} = 1$ indicates that UAV $u$ performs task $k$ on target $i$; otherwise $x^{u}_{i,k} = 0$. Constraint (2) ensures that
each UAV can attack only once, and (3) guarantees that each task on each target is performed exactly once, by a single UAV.
In (4), $A^{u}_{u_{i,k},k}$ is the ability of UAV $u_{i,k}$ to perform a task of type $k$ and $A^{t}_{i,k}$ is the ability demanded by target $i$.
Note that, to calculate the latest finish time, the completion time of every task on every target must be
calculated by (5). $PT_{u_{i,k}} = 0$ indicates that the UAV has no previous target. $PTS_{u_{i,k}}$ is the type of the task
that the UAV performed on its previous target.
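The recursion for the completion times can be illustrated with a short Python sketch. It is a minimal reading of (1) and (5) under stated assumptions rather than the authors' implementation: all UAVs start from one airport at time 0, flight time is Euclidean distance divided by a constant speed, $t_{i,0}$ is taken as 0, and the schedule is supplied in a valid execution order. The function name `evaluate_schedule` and all numbers are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def evaluate_schedule(
    schedule: List[Tuple[int, int, int]],  # (target i, task type k, UAV u) in execution order
    targets: Dict[int, Point],
    airport: Point,
    speed: Dict[int, float],
    task_time: Dict[int, float],
) -> Tuple[Dict[Tuple[int, int], float], float]:
    """Return the completion times t[i, k] and the objective value."""
    finish: Dict[Tuple[int, int], float] = {}  # t_{i,k}
    last: Dict[int, Tuple[int, int]] = {}      # per UAV: (previous target, previous task type)
    total_flight = 0.0
    for i, k, u in schedule:
        # Completion time of the preceding task on the same target (0 for k = 1).
        prev_task_done = finish[(i, k - 1)] if k > 1 else 0.0
        if u not in last:
            # No previous target: the UAV flies from the airport.
            flight = math.dist(airport, targets[i]) / speed[u]
            arrival = flight
        else:
            # The UAV leaves its previous target after finishing its task there.
            pt, pts = last[u]
            flight = math.dist(targets[pt], targets[i]) / speed[u]
            arrival = finish[(pt, pts)] + flight
        finish[(i, k)] = max(arrival, prev_task_done) + task_time[k]
        last[u] = (i, k)
        total_flight += flight
    # Objective: latest completion time plus total flight time.
    objective = max(finish.values()) + total_flight
    return finish, objective


# Hypothetical two-target, two-UAV example.
airport = (0.0, 0.0)
targets = {1: (3.0, 4.0), 2: (6.0, 8.0)}
speed = {1: 1.0, 2: 1.0}
task_time = {1: 2.0, 2: 1.0, 3: 1.0}
schedule = [(1, 1, 1), (1, 2, 2), (2, 1, 1), (1, 3, 2)]
finish, objective = evaluate_schedule(schedule, targets, airport, speed, task_time)
```

In the example, UAV 2 reaches target 1 at time 5 but must wait until the reconnaissance task finishes at time 7 before attacking, which is the role of the `max` in (5).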


2.2. The original GWO 
2.2.1. The inspiration 

This paper is based on the GWO, which is inspired by the hunting behavior of grey wolf packs
and their strict social hierarchy. The social hierarchy is shown in Figure 1: $\alpha$, $\beta$, and $\delta$ are the leader classes, and $\omega$ is
the subordinate class. During hunting, the $\omega$ wolves follow the guidance of the leaders. Hunting
behavior can be divided into three main processes: searching, encircling, and attacking. In the searching
process, the pack searches for prey in its territory. When prey is found, the $\alpha$ wolf directs the other
wolves to encircle and harass the prey to wear down its endurance. When the prey is exhausted, the pack
launches the final attack.


Figure 1. Hierarchy of grey wolf


2.2.2. Mathematical model 
Equations (6)-(9) are used to update the positions of the wolves for the next generation:


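$$r_1 \in [0,1], \quad r_2 \in [0,1], \quad a = 2\left(1 - \frac{t}{t_{\max}}\right) \tag{6}$$
$$A = 2a \cdot r_1 - a, \quad C = 2 r_2 \tag{7}$$
$$D = \left| C \cdot X_p(t) - X(t) \right| \tag{8}$$
$$X(t+1) = X_p(t) - A \cdot D \tag{9}$$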


In (6), $r_1$ and $r_2$ are two different $n$-dimensional random vectors with components between 0 and 1, and $a$ is the tuning
parameter for exploration and exploitation, which decreases linearly from 2 to 0 over the iterations ($t$ is the current
iteration and $t_{\max}$ is the maximum number of iterations). $A$ and $C$ are
coefficient vectors that introduce perturbations to imitate uncertainty. In (8) and (9), $X_p(t)$ is the position of the
prey and $X(t)$ is the position of a wolf. The GWO assumes that the leader
wolves have more information about the prey, which means that they are closer to the prey than the $\omega$ wolves.
Under the guidance of the leader wolves, the other wolves approach the prey continuously until they catch it. This
process can be expressed as:


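$$D_\alpha = |C_1 \cdot X_\alpha(t) - X(t)|, \quad D_\beta = |C_2 \cdot X_\beta(t) - X(t)|, \quad D_\delta = |C_3 \cdot X_\delta(t) - X(t)| \tag{10}$$
$$X_1 = X_\alpha(t) - A_1 \cdot D_\alpha, \quad X_2 = X_\beta(t) - A_2 \cdot D_\beta, \quad X_3 = X_\delta(t) - A_3 \cdot D_\delta \tag{11}$$
$$X(t+1) = \frac{X_1 + X_2 + X_3}{3} \tag{12}$$
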
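The update described by (6), (7), and (10)-(12) can be sketched in plain Python as follows. This is a minimal illustration of the standard GWO position update, not the authors' MATLAB code; the function name `gwo_step` and the parameter `t_max` (maximum number of iterations) are assumptions, fresh random numbers are drawn for every leader and dimension, and bound handling and fitness evaluation are left out.

```python
import random
from typing import List, Sequence

Vector = List[float]


def gwo_step(wolves: Sequence[Sequence[float]],
             alpha: Sequence[float],
             beta: Sequence[float],
             delta: Sequence[float],
             t: int,
             t_max: int,
             rng: random.Random) -> List[Vector]:
    """Move every wolf towards the three leaders for one iteration."""
    a = 2.0 * (1.0 - t / t_max)  # decreases linearly from 2 to 0
    new_positions: List[Vector] = []
    for x in wolves:
        new_x: Vector = []
        for j in range(len(x)):
            candidates = []
            for leader in (alpha, beta, delta):
                r1, r2 = rng.random(), rng.random()
                A = 2.0 * a * r1 - a               # coefficient A
                C = 2.0 * r2                       # coefficient C
                D = abs(C * leader[j] - x[j])      # distance to this leader
                candidates.append(leader[j] - A * D)
            new_x.append(sum(candidates) / 3.0)    # average of X1, X2, X3
        new_positions.append(new_x)
    return new_positions


# At the last iteration a = 0, so every wolf lands on the mean of the leaders.
final = gwo_step([[5.0, -5.0]], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0],
                 t=10, t_max=10, rng=random.Random(0))
```

Because $|A|$ shrinks together with $a$, early iterations allow large moves around the leaders while late iterations pull all wolves onto their average, which is the source of both the fast convergence and the loss of diversity discussed in this paper.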
3. METHOD 


Many studies have sought to enhance the performance of GWO, mainly in three ways:
adjusting parameters, integrating operators, and combining GWO with other metaheuristic
algorithms. In terms of adjusting parameters, Meidani et al. [19] proposed AGWO to enhance performance
by modifying the parameter. Jitkongchuen et al. [20] introduced weighted parameters for the three leader wolves
to control the influence of each leader and improve the ability to escape local optima. In terms of integrating
operators, Heidari and Pahlavani [21] introduced the Lévy flight strategy into GWO to enhance exploration, and
Gupta and Deep [22] added a random walk strategy to GWO to avoid local optima. As for combining GWO
with other existing metaheuristic algorithms, Singh and Singh [23] combined GWO with PSO and proposed
a hybrid metaheuristic algorithm named HGWOPSO, and Tawhid and Ali [24] combined GWO with GA and
achieved good results. In this paper, we introduce two strategies to enhance the performance of GWO; the
details are described below.


3.1. Congestion control strategy 

To avoid premature convergence in the early stage, agents need to explore the search space as
much as possible. Keeping the wolves at a distance from each other is a viable approach. The same happens in nature:
when wolves encircle the prey, they keep their distance from each other to avoid hurting themselves
[25]. Inspired by this, a threshold is set to maintain the distance between the leader wolves and the other wolves.
When the distance is smaller than the threshold, the position is reset randomly in the search space.
The threshold is defined as follows:


$$w = -\left[ \frac{w_i}{1 + \left( \frac{w_i}{w_f} - 1 \right) \times e^{-r \times t}} \right] + (w_i + w_f) \tag{13}$$
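$$r = 0.01 \times (\log_{10} w_i - \log_{10} w_f) \tag{14}$$
$$w_i = 0.05 \times (ub - lb) \tag{15}$$
$$X(t) = \begin{cases} X_{rand}(t), & d < w \\ X(t), & d \ge w \end{cases} \tag{16}$$
$$d = |X(t) - X_p(t)| \tag{17}$$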


where $w$ is the threshold that controls the distances. In (13), $w_i$ is the initial value of $w$, which depends on the upper
and lower bounds of the search space, and $w_f$ is the final value of $w$, which is related to the required accuracy of the problem. $r$ is the step length,
and $t$ is the iteration number. The Euclidean distance $d$ is used to measure the distance between individuals.
The position is reset when the distance $d$ is less than $w$.
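The reset rule can be sketched in Python as below. The helpers for (14) and (15) follow the text directly, whereas the schedule in `threshold` is only an assumption: it is a logistic decay chosen to start at $w_i$ and approach $w_f$, and it may differ from the exact form of (13). The function names, the use of a single leader position, and the scalar bounds `lb` and `ub` are likewise illustrative choices.

```python
import math
import random
from typing import List, Sequence


def initial_threshold(lb: float, ub: float) -> float:
    """Initial threshold w_i = 0.05 * (ub - lb)."""
    return 0.05 * (ub - lb)


def step_length(w_i: float, w_f: float) -> float:
    """Step length r = 0.01 * (log10(w_i) - log10(w_f))."""
    return 0.01 * (math.log10(w_i) - math.log10(w_f))


def threshold(t: int, w_i: float, w_f: float) -> float:
    """Assumed logistic decay from w_i at t = 0 towards w_f for large t."""
    r = step_length(w_i, w_f)
    return (w_i + w_f) - w_i / (1.0 + (w_i / w_f - 1.0) * math.exp(-r * t))


def congestion_control(wolves: Sequence[Sequence[float]],
                       leader: Sequence[float],
                       w: float,
                       lb: float,
                       ub: float,
                       rng: random.Random) -> List[List[float]]:
    """Reset every non-leader wolf that is closer than w to the leader."""
    result: List[List[float]] = []
    for x in wolves:
        d = math.dist(x, leader)  # Euclidean distance to the leader
        if d < w:
            # Too crowded: draw a new random position inside the bounds.
            result.append([rng.uniform(lb, ub) for _ in x])
        else:
            result.append(list(x))
    return result


w_i = initial_threshold(-50.0, 50.0)
pack = congestion_control([[0.0, 0.0], [10.0, 10.0]], [0.1, 0.0],
                          w=1.0, lb=-50.0, ub=50.0, rng=random.Random(1))
```

With these settings the first wolf lies within the threshold of the leader and is relocated, while the second keeps its position. A shrinking threshold means that resets are frequent early on, when diversity matters most, and rare near the end of the run.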


3.2. Global best search strategy 

During the search process, the position update of the $\omega$ wolves depends mainly on the guidance of the three
leading wolves, so the newly selected leading wolves are very likely to be near the positions of the
previous leading wolves. This makes the algorithm converge quickly, but it also limits the exploration
ability of the algorithm and makes it prone to falling into a locally optimal solution. To overcome this
drawback, an update phase is introduced for the leading wolves. In this phase, each leading wolf performs
a random search; if it finds a better position, it moves to that position, otherwise it remains unchanged.
The update formula is as follows:


$$x_i' = x_i + (2u - 1)(ub_i - lb_i), \quad u \in [0,1], \quad i = 1, 2, \cdots, dim \tag{18}$$


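In (18), $x_i$ is the $i$-th component of the position of a leading wolf and $x_i'$ is its candidate new value, $u$ is a random
value in $[0, 1]$, and $ub_i$ and $lb_i$ are the upper and lower boundaries of the $i$-th dimension.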
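A possible Python sketch of this leader update is given below. It assumes a minimization problem, draws a separate random $u$ for each dimension, and clips the candidate to the bounds; none of these three choices is stated explicitly in the text, and the function name `leader_random_search` is hypothetical.

```python
import random
from typing import Callable, List, Sequence


def leader_random_search(leader: Sequence[float],
                         fitness: Callable[[Sequence[float]], float],
                         lb: Sequence[float],
                         ub: Sequence[float],
                         rng: random.Random) -> List[float]:
    """Try one random move for a leading wolf and keep it only if it improves."""
    candidate: List[float] = []
    for x_i, lo, hi in zip(leader, lb, ub):
        u = rng.random()                             # u in [0, 1)
        moved = x_i + (2.0 * u - 1.0) * (hi - lo)    # random step scaled by the range
        candidate.append(min(max(moved, lo), hi))    # assumption: clip to the bounds
    # Greedy acceptance (minimization): otherwise the leader stays unchanged.
    if fitness(candidate) < fitness(leader):
        return candidate
    return list(leader)


def sphere(x: Sequence[float]) -> float:
    """Simple test function with its minimum at the origin."""
    return sum(v * v for v in x)


start = [3.0, -4.0]
updated = leader_random_search(start, sphere, [-10.0, -10.0], [10.0, 10.0],
                               random.Random(42))
```

The greedy acceptance guarantees that the leader never gets worse, so the extra search costs only additional fitness evaluations per iteration.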

4. RESULTS AND DISCUSSION
4.1. Experimental setup 

The experiments were implemented in MATLAB R2019a and performed on a PC with
a 3.00 GHz Intel(R) Core(TM) i7-9700 CPU. Four metaheuristic algorithms, PSO [26], GWO [9], ACO [27],
and DE [28], were tested on two examples of different scales for comparison with the proposed IGWO. Each algorithm
was run independently 10 times, and the optimal solution, the worst solution, the variance, the mean value, and
the running time of the results were taken as evaluation indexes. The parameter settings of each algorithm are
shown in Table 3. The specific parameters of three examples can be found at https://github.com/sameleer/UAVS.
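For reference, the evaluation indexes over repeated runs can be computed as in the following Python sketch. The function name `summarize_runs` and the input values are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not state which estimator was used.

```python
import statistics
from typing import Dict, Sequence


def summarize_runs(values: Sequence[float]) -> Dict[str, float]:
    """Summarize the final objective values of independent runs (minimization)."""
    return {
        "best": min(values),
        "worst": max(values),
        "mean": statistics.fmean(values),
        "std": statistics.stdev(values),  # assumption: sample standard deviation
    }


# Hypothetical objective values from three runs (not data from the paper).
summary = summarize_runs([1.0, 2.0, 3.0])
```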


Table 3. Parameter settings for the experiments

Algorithm    Parameter                                  Value
PSO          Acceleration constants ($c_1$, $c_2$)      [1.5, 2.0]
             Inertia weights ($w$)                      [1, 0.99]
DE           Crossover probability ($p_c$)              0.8
             Differential weight                        0.5
GWO          $a$                                        $a$ was linearly decreased from 2 to 0
ACO          Pheromone exponential weight ($\alpha$)    1
             Evaporation rate (rho)                     0.1
IGWO         $a$                                        $a$ was linearly decreased from 2 to 0


4.2. Result analyze 

Example 1 is a small-scale example with 5 mission targets and 8 UAVs. The experimental results
are shown in Table 4 and Figures 2 and 3. As can be seen from Table 4, in the best case the gap between
the algorithms is not very large. In the worst case, however, GWO and PSO fail to obtain a feasible
solution because they become trapped in a locally optimal solution during the search. In contrast,
IGWO does not fall into a locally optimal solution and gives a feasible solution even in the worst case, which
indicates that the improvement strategies are effective and enhance the ability of the original GWO to escape
local optima. DE has the best performance and is ahead of the other algorithms in terms of best,
worst, mean, and time. The running time of IGWO is longer than that of the other algorithms because of its higher
computational complexity. The specific allocation schemes are shown in Figure 3, where Figures 3(a)-(e)
represent the solutions of IGWO, GWO, PSO, ACO, and DE, respectively. The vertical
axis represents the index of the UAV and the horizontal axis represents the time; the most reasonable
allocation scheme is given by DE.

Example 2 is a large-scale example with 30 mission targets and 50
UAVs. The experimental results are shown in Table 5 and Figures 4 and 5. As can be seen from the results, as the
scale of the problem increases, the difficulty of solving it also increases. Except for IGWO, none of the algorithms
gives a feasible solution. The experimental results show that IGWO has a distinct advantage on this example.


Table 4. The result of example 1

Algorithm    Best        Worst         Mean        Std         Time (s)
IGWO         88.5193     99.1645       93.6945     3.8762      7.745
GWO          105.5721    1105.9565*    216.4815    312.5892    5.352
PSO          105.598     1100.8524*    214.9853    311.4252    5.085
ACO          97.9968     105.9746      102.4901    2.8058      6.72
DE           87.7699     96.7998       91.6986     3.2435      4.871

Figure 2. The convergence curve of example 1

Figure 3. The best result of example 1: (a) IGWO, (b) GWO, (c) PSO, (d) ACO, and (e) DE


Table 5. The result of example 2

Algorithm    Best           Worst           Mean          Std           Time (s)
IGWO         3468.4639      4632.3932       3840.5823     367.3512      59.886
GWO          48156.1822*    146032.6772*    98867.2849    36823.326     22.568
PSO          19179.4908*    163847.6439*    41248.0035    43582.1682    21.189
ACO          10317.7007*    12512.2719*     11529.5041    712.6525      34.463
DE           37765.628*     50173.7063*     42871.0893    3746.2556     22.029

Figure 4. The convergence curve of example 2

Figure 5. The best result of IGWO in example 2


5. CONCLUSION 

In this paper, the congestion control strategy and the global best search strategy are introduced into
GWO to remedy its poor population diversity and premature convergence. The proposed IGWO is then
applied to the multiple-UAV task allocation problem. The following conclusions can be drawn from
experiments on three examples of different scales: i) the congestion control strategy preserves the
diversity of the population by resetting the positions of individuals that are too close to the leader wolves, ii) the
global best search strategy improves the exploration performance of the algorithm by adding a
random search phase for the leading wolves and enhances the ability of the algorithm to escape locally
optimal solutions, and iii) IGWO performs well in small-scale examples and shows clear advantages in large-scale
examples, which indicates that IGWO is more suitable for large-scale optimization problems. However,
the IGWO still has shortcomings. Because of the added search process of the leader wolves, the computational
complexity increases, so IGWO takes more time than the other algorithms. In future work, we will
consider reducing its computational complexity so that it can solve large-scale optimization problems faster.